For better or for worse, rankings of institutions, such as universities,
schools and hospitals, play an important role today in conveying
information about relative performance. They inform policy decisions 
and budgets, and are often reported in the media. While overall rank- 
ings can vary markedly over relatively short time periods, it is not un- 
usual to find that the ranks of a small number of "highly performing" 
institutions remain fixed, even when the data on which the rankings 
are based are extensively revised, and even when a large number of 
new institutions are added to the competition. In the present paper, 
we endeavor to model this phenomenon. In particular, we interpret 
as a random variable the value of the attribute on which the ranking 
should ideally be based. More precisely, if p items are to be ranked 
then the true, but unobserved, attributes are taken to be values of p 
independent and identically distributed variates. However, each at- 
tribute value is observed only with noise, and via a sample of size 
roughly equal to n, say. These noisy approximations to the true at- 
tributes are the quantities that are actually ranked. We show that, 
if the distribution of the true attributes is light-tailed (e.g., normal 
or exponential) then the number of institutions whose ranking is cor- 
rect, even after recalculation using new data and even after many 
new institutions are added, is essentially fixed. Formally, p is taken 
to be of order $n^C$ for any fixed $C > 0$, and the number of institutions
whose ranking is reliable depends very little on p. On the other hand, 
cases where the number of reliable rankings increases significantly 
when new institutions are added are those for which the distribution 
of the true attributes is relatively heavy-tailed, for example, with tails 
that decay like $x^{-\alpha}$ for some $\alpha > 0$. These properties and others are
explored analytically, under general conditions. A numerical study 
links the results to outcomes for real-data problems. 

1. Introduction. There are many contemporary settings in which ranking
plays an important role. For example, universities, schools and hospitals
are regularly ranked in a variety of contexts, the results of which typically
generate interest and can often drive policy decisions. In many of these situa- 
tions, a given ranking can carry a high degree of uncertainty, with this effect 
particularly pronounced in high-dimensional cases; that is, where there are 
very many populations or institutions to be ranked. 

Despite this, one feature of many rankings reported over time is that the 
ordering at the extreme top or bottom remains relatively invariant. For example,
in the THE-QS university rankings,$^1$ Harvard University has ranked
first for each of the years 2005-2008, while New York University's rankings 
are 56, 43, 49 and 40. If we believe that the observed data used for ranking 
are measures of true underlying values, distorted by noise, then we can rein- 
terpret this behavior as a tendency to obtain correct rankings at extremes, 
but not otherwise. It is this phenomenon that we explore in this paper, using 
both theoretical and numerical arguments. 

Intuitively, this behavior has a natural explanation. Those scores at the 
extreme of a range are more likely to be sufficiently "spaced out" to over- 
come the problems of data noise, whereas less extreme scores are likely to 
be bunched more closely together. We introduce models that describe this 
behavior and explore their properties. Related to this, it turns out that one 
important consideration for correct ranking at the extremes is whether the 
possible scores used for ranking have infinite support but nevertheless have 
light tails. If this is the case and the tail of the distribution of the underlying 
scores is smooth, we can expect accurate ranking of the top portion of the 
institutions, even when dimension is very large. Moreover, even when the 
support is bounded, there remains potential for correct ranking at extremes, 
although now there is greater likelihood that the ranking will change if new 
institutions are added. Such results have a variety of practical implications; 
we briefly present two of these here, with more detail provided in the nu- 
merical section. 

Example 1 (University rankings). Suppose we attempt to rank univer- 
sities and other research institutions by counting how many papers their 
faculty members publish in Nature$^2$ each year. This is a high-dimensional
example due to the large number of institutions competing to be published. 
Figure 1 shows the ranking of the top 50 institutions on this measure. The 
institutions are aligned along the horizontal axis, with each dot denoting
the point estimate of the rank and the vertical line a corresponding
estimated 90% prediction interval. The four plots show how the prediction
intervals change as we increase the number of years, n, of data used for the
ranking.



1 www.topuniversities.com.
2 www.nature.com/nature/index.html.

Fig. 1. Prediction intervals for top-ranked universities based on publications in Nature,
averaged over various numbers of years. 



The two main observations are that the prediction intervals are widest 
when a smaller number of years are considered, and that the prediction 
intervals for the highest ranked universities are the smallest. In fact, the 
intervals are small enough in the extremes to give us genuine confidence in 
that aspect of the ranking. Even when n = 1, we can be reasonably sure 
that the top ranked institution (Harvard University) is in fact ranked cor- 
rectly. When n = 15, the top four universities are known with a high degree 
of certainty, and the next set of ten or so is fairly stable too. Thus, it is 
possible to have correctness in the upper extreme of this ranking, even when 
the lower ranks remain highly variable. In the present paper, we model this 
phenomenon by addressing the underlying stochastic properties of the insti- 
tutions; the data provide only a noisy measure of this random process, and 
we assess the impact of the noise on the ranking. 

Example 2 (Microarray data). We take the colon microarray data first 
analysed by Alon et al. (1999). It consists of 62 observations in total, each 
of which indicates either a normal colon or a tumor. For each observation, 
there are also expression levels for p = 2000 genes. It is of interest to de- 
termine which genes are most closely related to the response, so that they 
can be investigated further. This of course amounts to a ranking and we are 
interested in stability at the extreme, since we seek only a small number 
of genes. Here, the genes are ranked based on the Mann-Whitney U test
statistic, which is a nonparametric assessment of the difference between the
two distributions.

Fig. 2. Prediction intervals for top-ranked genes in colon dataset.

Figure 2 plots the top 30 genes, ranked by the lower tail of an estimated 
90% prediction interval, rather than the point estimate of the rank. In this 
situation, we cannot authoritatively conclude that any of the top genes are 
ranked exactly correctly, but the top four genes appear much more stable 
than the others. This stability is highly important; if the length of all predic- 
tion intervals were roughly the same as the average length (1400 genes), then 
there would be little hope of discovering useful genes from such datasets. 

There is a literature on the bootstrap in connection with rankings. See 
Goldstein and Spiegelhalter (1996), who discuss bootstrap methods for con- 
structing prediction intervals for rankings; Langford and Leyland (1996), 
who address bootstrap methods for ranking the performance of doctors; 
Cesario and Barreto (2003), Hui, Modarres and Zheng (2005) and Taconeli 
and Barreto (2005), who take up the problem of bootstrap methods for 
ranked set sampling; Mukherjee et al. (2003), who develop methods for gene 
ranking using bootstrapped p-values; and Xie, Singh and Zhang (2009) and 
Hall and Miller (2009), who focus on consistent bootstrap methods for assess- 
ing rankings. More generally, there is a vast literature on ranking problems 
in statistics, and we cite here only the more relevant items since 2000. Joe 
(2000, 2001) discusses ranking problems in connection with random utility 
models, and points to connections to multivariate extreme value theory. Mur- 
phy and Martin (2003) develop mixture-based models for rankings. Mease 
(2003) and Barker et al. (2005) treat methods for ranking football players. 
McHale and Scarf (2005) study the problem of ranking immunisation coverage
in U.S. states. Brijs et al. (2006, 2007) introduce Bayesian models for
the ranking of hazardous road sites, with the aim of better scheduling road
safety policies. Chen, Stasny and Wolfe (2006) discuss ranking accuracy in
ranked-set sampling methods, and Opgen-Rhein and Strimmer (2007) ex- 
amine the accuracy of gene rankings in high-dimensional problems involv- 
ing genomic data. Nordberg (2006) addresses the reliability of performance 
rankings. Corain and Salmaso (2007) and Quevedo, Bahamonde and Luaces 
(2007) discuss ways of constructing rankings. 

Section 2 describes our model for the ranking problem, and discusses 
the main properties of this framework. The formal theoretical results which 
underpin the discussion in Section 2 are given in Section 3. Section 4 presents 
simulated and real-data numerical work, including details on the examples 
presented above. Technical proofs are deferred to Section 5. 

2. Model. We consider a set of underlying parameters $\Theta_1,\ldots,\Theta_p$
corresponding to the objects to be ranked, hereafter referred to as items. The error
in the estimation is controlled by the number of observed data points, $n$. In
our analysis, we take $p = p(n)$ to diverge with $n$ as the latter increases. An
obvious difficulty here is in establishing where the newly added items should
fit into the ranking. A natural solution is to take the $\Theta_j$'s to be randomly
generated from some distribution function. In the setup below, we interpret
the $\Theta_j$'s as values of means; see the end of this section for generalizations.

Let $\Theta_1,\ldots,\Theta_p$ denote independent and identically distributed random
variables, and write

(2.1)    $\Theta_{(1)} \le \cdots \le \Theta_{(p)}$

for their ordered values. There exists a permutation $R = (R_1,\ldots,R_p)$ of
$(1,\ldots,p)$ such that $\Theta_{R_j} = \Theta_{(j)}$ for $1 \le j \le p$. If the common distribution of
the $\Theta_j$'s is continuous, then the inequalities in (2.1) are all strict and the
permutation is unique.

We typically do not observe the $\Theta_j$'s directly, only in terms of noisy
approximations which can be modelled as follows. Let $Q_i = (Q_{i1},\ldots,Q_{ip})$
denote independent and identically distributed random $p$-vectors with finite
variance and zero mean, independent also of $\Theta = (\Theta_1,\ldots,\Theta_p)$. Suppose we
observe

(2.2)    $X_i = (X_{i1},\ldots,X_{ip}) = Q_i + \Theta$

for $1 \le i \le n$. The mean vector

(2.3)    $\bar X = (\bar X_1,\ldots,\bar X_p) = \frac{1}{n}\sum_{i=1}^{n} X_i = \bar Q + \Theta$

is an empirical approximation to $\Theta$. (Here, $\bar Q = n^{-1}\sum_i Q_i$ equals the mean
of the $p$-vectors $Q_i$.) The components of $\bar X$ can also be ranked, as

(2.4)    $\bar X_{(1)} \le \cdots \le \bar X_{(p)}$,

and there is a permutation $\hat R_1,\ldots,\hat R_p$ of $1,\ldots,p$ such that $\bar X_{\hat R_j} = \bar X_{(j)}$ for
each $j$. If the common distribution of the $\Theta_j$'s is continuous then, regardless
of the distribution of the components of $Q_i$, the inequalities in (2.4) are
strict with probability 1.

The permutation $\hat R = (\hat R_1,\ldots,\hat R_p)$ serves as an approximation to $R$, and
we wish to determine the accuracy of that approximation. In particular, for
what values of $j_0 = j_0(n,p)$, and for what relationships between $n$ and $p$, is
it true that

(2.5)    $P(\hat R_j = R_j \text{ for } 1 \le j \le j_0) \to 1$

as $n$ and $p$ diverge? That is, how deeply into the ranking can we go before
the connection between the true ranking and its empirical form is seriously
degraded by noise?
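
The probability at (2.5) can be explored by simulation. The following is a minimal sketch, not part of the original analysis: it assumes standard normal $\Theta_j$'s and independent standard normal errors, and the function name `prob_top_ranks_correct` and its arguments are hypothetical choices made for illustration only.

```python
import numpy as np

def prob_top_ranks_correct(n, p, j0, n_rep=200, seed=0):
    """Monte Carlo estimate of P(R-hat_j = R_j for 1 <= j <= j0) under assumed N(0,1) models."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_rep):
        theta = rng.standard_normal(p)               # true attributes Theta_j (assumption: N(0,1))
        # The mean of n i.i.d. N(0,1) errors is N(0, 1/n), so Q-bar is drawn directly.
        q_bar = rng.standard_normal(p) / np.sqrt(n)
        x_bar = theta + q_bar                        # (2.3): X-bar = Q-bar + Theta
        true_order = np.argsort(theta)[:j0]          # R_1, ..., R_{j0} (lowest values first)
        est_order = np.argsort(x_bar)[:j0]           # R-hat_1, ..., R-hat_{j0}
        hits += np.array_equal(true_order, est_order)
    return hits / n_rep

if __name__ == "__main__":
    for j0 in (1, 2, 4, 8):
        print(j0, prob_top_ranks_correct(n=100, p=1000, j0=j0))
```

Varying `p` with `n` held fixed gives an informal check of how strongly the estimated probability depends on the number of items under these assumed distributions.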

The answer to this question depends to some degree on the extent of
dependence among the components of each $Q_i$. To elucidate this point, let
us consider the case where all the components of $Q_i$ are identical; this is
an extreme case of strong dependence. Then the components of $\bar Q$ are also
identical. Clearly, in this setting $\hat R_j = R_j$ for each $j$, and so (2.5) holds in
a trivial and degenerate fashion. Other strongly dependent cases, although
not as clear-cut as this one, can also be shown to be ones where $\hat R_j = R_j$
with high probability for many values of $j$.

The case which is most difficult, that is, where the strongest conditions
are needed to ensure that (2.5) holds, occurs when the components of $Q_i$
are independent. To emphasize this point we give sufficient conditions for
(2.5), and show that when the components of each $Q_i$ are independent, those
conditions are also necessary. Our arguments can be modified to show that
the conditions continue to be necessary under sufficiently weak dependence,
for example if the components are $m$-dependent where $m = m(n)$ diverges
sufficiently slowly as $n$ increases.

The assumptions under which (2.5) holds are determined mainly by the
lower tail of the common distribution of the $\Theta_j$'s. If that distribution has
an exponentially light left-hand tail, for example, if the tail is like that of
a normal distribution, then a sufficient condition for (2.5) is that $j_0$ should
increase at a strictly slower rate than $n^{1/4}(\log n)^c$, where the constant $c$,
which can be either positive or negative, depends on the rate of decay of
the exponential lower tail of the distribution of $\Theta$. For example, $c = 0$ if the
distribution decays like $e^{-|x|}$ in the lower tail, and $c = -\frac{1}{4}$ if it is normal.
As indicated in the previous paragraph, the condition $j_0 = o\{n^{1/4}(\log n)^c\}$
is also necessary for (2.5) if the components of the $Q_i$'s are independent.

These results have several interesting aspects, including: (a) the exponent
$\frac{1}{4}$ in the condition $j_0 = o\{n^{1/4}(\log n)^c\}$ does not change among different
types of distribution with exponential tails; (b) the exponent is quite
small, implying that the empirical rankings $\hat R_j$ quite quickly become unreliable
as predictors of the true rankings $R_j$; and (c) the critical condition
$j_0 = o\{n^{1/4}(\log n)^c\}$ does not depend on the value of $p$. (We assume that $p$
diverges at no faster than a polynomial rate in $n$, but we impose no upper
bound on the degree of that polynomial.)

The condition on $j_0$ such that (2.5) holds changes in important ways if
the lower tail of the distribution of the $\Theta_j$'s decays relatively slowly, for example,
at the polynomial rate $x^{-\alpha}$ as $x \to \infty$. Examples of this type include
Pareto, nonnormal stable and Student's t distributions, and more generally,
distributions with regularly varying tails. Here a sufficient condition for (2.5)
to hold is $j_0 = o\{(n^{\alpha/2}p)^{1/(2\alpha+1)}\}$, and this assumption is necessary if the
components of the $Q_i$'s are independent. In this setting, unlike the exponential
case, the value of dimension, $p$, plays a major role in addition to the
sample size, $n$, in determining the number of reliable rankings.

In practical terms, a major way in which this heavy-tailed case differs 
from the light-tailed setting considered earlier is that if a polynomially large 
number of new items are added to the competition in the heavy-tailed case, 
and all items are reranked, the results will change significantly and the 
number of correct rankings will also alter substantially. By way of contrast, 
if a polynomially large number of new items are added in the light-tailed, 
or exponential, case then there will again be many changes to the rankings, 
but now there will be relatively few changes to the number of items that are 
correctly ranked. 
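
The contrast between the two regimes can be made concrete by evaluating the two critical rates numerically. The sketch below is illustrative only: it writes the light-tailed exponent as $c = \{(1/\alpha)-1\}/2$, as in (3.4), and the function names `threshold_light` and `threshold_heavy`, together with the chosen values of $n$, $p$ and $\alpha$, are assumptions rather than anything taken from the paper's numerical work.

```python
import math

def threshold_light(n, alpha):
    """Light-tailed rate n^{1/4} (log n)^c with c = {(1/alpha) - 1}/2; it does not involve p."""
    c = (1.0 / alpha - 1.0) / 2.0
    return n ** 0.25 * math.log(n) ** c

def threshold_heavy(n, p, alpha):
    """Heavy-tailed rate (n^{alpha/2} p)^{1/(2 alpha + 1)}; it grows with p."""
    return (n ** (alpha / 2.0) * p) ** (1.0 / (2.0 * alpha + 1.0))

if __name__ == "__main__":
    n = 10_000                      # illustrative sample size
    for p in (10**3, 10**4, 10**5): # illustrative numbers of items
        print(p, round(threshold_light(n, alpha=2.0), 2), round(threshold_heavy(n, p, alpha=2.0), 2))
```

By construction the first printed rate is the same for every $p$, whereas the second increases with $p$, which is the qualitative difference described above. These are rates only; the constants multiplying them are not specified by the theory.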

The exponential case can be regarded as the limit, as $\alpha \to \infty$, of the polynomial
case. More generally, note that as the left-hand tail of the common
distribution of the $\Theta_j$'s becomes heavier, the value of $j_0$ can be larger before
(2.5) fails. That is, if the distribution of the $\Theta_j$'s has a heavier left-hand
tail then the empirical rankings $\hat R_j$ approximate the true rankings $R_j$ for a
greater number of values of $j$, before they degenerate into noise.

The analysis above has focused on cases where the ranks of the $\Theta_j$'s
are estimated by ranking empirical means of noisy observations of those
quantities; see (2.4). However, similar results are obtained if we rank other
measures of location. Such a measure need only satisfy moderate deviation
properties similar to (5.3) and (5.4) in the proof of Theorem 1. Thus, the
results are applicable to a wide range of ranking contexts. For example, $L_q$
location estimators for general $q \ge 1$ enjoy moderate deviation properties under
appropriate assumptions. Therefore, if we take the variables $Q_{ij}$ to have
zero median, rather than zero mean, and continue to define $X_i$ by (2.2) but
replace the ranking in (2.4) by a ranking of medians, then the results above
and those in Section 3 continue to hold, modulo changes to the regularity
conditions. Other suitable measures include the Mann-Whitney test used in
the genomic example, quantiles and some correlation-based measures.

The model suggested by (2.2), where data on $\Theta$ arise in the form of
$p$-vectors $X_1,\ldots,X_n$, is attractive in a number of high-dimensional settings,
for example, genomics. There, the $j$th component $X_{ij}$ of $X_i$ would typically
represent the expression level of the $j$th gene of the $i$th individual in
a sample. However, in other cases the means $\bar X_1,\ldots,\bar X_p$ at (2.3), or medians
or other location estimators, might be computed from quite different
datasets, one for each component index $j$. Moreover, those datasets might
be of different sizes, $n_1,\ldots,n_p$ say, and then the argument that they arise
naturally in the form of vectors would be inappropriate. This can happen
when data are used to rank items, for example schools where the ranking is
based on individual student performance. The conclusions discussed earlier
in this section, and the theoretical properties developed in Section 3 below,
continue to apply in this case provided there is an "average" value, $n$ say,
of the $n_j$'s which represents all of them, in the sense that

(2.6)    $n = O\bigl(\min_{1 \le j \le p} n_j\bigr)$  and  $\max_{1 \le j \le p} n_j = O(n)$

as $n$ diverges. Additionally, in such cases it is often realistic to make the
assumption that the corresponding centred means (or medians, etc.)
$\bar Q_j = n_j^{-1}\sum_i Q_{ij}$ are stochastically independent of one another, and so the
particular results that are valid in this case are immediately available.

The distribution of the $\Theta_j$'s has been taken to be continuous. This is
usually appropriate although there can be contexts in which the distribution
is discrete. Note that the assumption of discreteness of the $\Theta_j$'s is different
from that of discreteness of the observations $X_{ij}$. In such cases, the analysis
still holds, except that allowance must be made for ties (any reordering
of tied $\Theta_j$'s is still "correct"), and the tail density assumptions should be
characterized in integral form.

The model has been set up so that it focuses on the populations with 
lowest parameters $\Theta_j$. Obviously, similar arguments apply to the largest
parameters too, so the results are applicable to both the most highly and 
lowly ranked populations. 

3. Theoretical properties. For the most part, we shall assume one of two
types of lower tail for the common distribution function, $F$, of the random
variables $\Theta_j$: either it decreases exponentially fast, in which case we suppose
that $F(-x) \asymp x^{\beta}\exp(-C_0 x^{\alpha})$ as $x \to \infty$, where $C_0, \alpha > 0$ and
$-\infty < \beta < \infty$; or it decreases polynomially fast, in which case
$F(-x) \asymp x^{-\alpha}$ as $x \to \infty$, where $\alpha > 0$. [The notation $f(x) \asymp g(x)$, for
positive functions $f$ and $g$, will be taken to mean that $f(x)/g(x)$ is bounded
away from zero and infinity as $x \to \infty$.] The former case covers distributions
such as the normal, exponential and Subbotin; the latter, distributions such
as the Pareto, Student's t and nonnormal stable laws (e.g., the Cauchy).

It is convenient to impose the shape constraints on the densities, which we
assume to exist in the lower tail, rather than on the distribution functions.
Therefore, we assume that one of the following two conditions holds as
$x \to \infty$:

(3.1)    $(d/dx)F(-x) \asymp (d/dx)\,x^{\beta}\exp(-C_0 x^{\alpha})$,

(3.2)    $(d/dx)F(-x) \asymp (d/dx)\,x^{-\alpha}$.

In both (3.1) and (3.2), $\alpha$ must be strictly positive, but $\beta$ in (3.1) can be
any real number. The constant $C_0$ in (3.1) must be positive. We assume too
that

(3.3)    $p = O(n^{C_1})$ as $n \to \infty$, and, for each $j \ge 1$, $E|Q_{ij}|^{C_2} \le C_3$,
         $E(Q_{ij}) = 0$, and $E(Q_{ij}^2) \in [C_4, C_5]$,

for fixed constants $C_1,\ldots,C_5 > 0$, where $C_2 > 2(C_1 + 1)$ and $C_4 < C_5$.

Recall from Section 2 that we wish to examine the probability that the
true ranks $R_j$, and their estimators $\hat R_j$, are identical over the range
$1 \le j \le j_0$. We consider both $j_0$ and $p$ to be functions of $n$, so that the main
dependent variable can be considered to be $n$. With this interpretation,
define

(3.4)    $\ell_{\exp} = \ell_{\exp}(n) = n^{1/4}(\log n)^{\{(1/\alpha)-1\}/2}$,
         $\ell_{\mathrm{pol}} = \ell_{\mathrm{pol}}(n) = (n^{\alpha/2}p)^{1/(2\alpha+1)}$,

where the subscripts denote "exponential" and "polynomial," respectively,
and refer to the respective cases represented by (3.1) and (3.2). In the theorem
below, we impose the additional condition that, for some $\epsilon > 0$,

(3.5)    $n = O(p^{4-\epsilon})$.

This restricts our attention to problems that are genuinely high dimensional,
in the sense that, with probability converging to 1, not all the rankings are
correct. Cases where $p$ diverges sufficiently slowly as a function of $n$ are
easier and will generally permit all ranks to be correctly determined with
high probability. Assumption (3.5) is also very close, in both the exponential
and polynomial cases, to the basic condition $j_0 \le p$, as can be seen via a little
analysis starting from (3.6) and (3.7) in the respective cases; yet, at the same
time, (3.5) is suitable to both cases, and so helps to unify our account of
their properties. Note too that (3.5) implies that, in both the exponential
and polynomial cases, $\ell_{\exp} = O(p^{1-\delta})$ and $\ell_{\mathrm{pol}} = O(p^{1-\delta})$ for some $\delta > 0$.

Theorem 1. Assume (3.3), (3.5) and that either (a) (3.1) or (b) (3.2)
holds. In case (a), if

(3.6)    $j_0 = o(\ell_{\exp})$

as $n \to \infty$ then (2.5) holds. Conversely, when the components of the vectors
$Q_i$ are independent, (3.6) is necessary for (2.5). In case (b), if

(3.7)    $j_0 = o(\ell_{\mathrm{pol}})$,

then (2.5) holds. Conversely, when the components of the vectors $Q_i$ are
independent, (3.7) is necessary for (2.5).

It can be deduced from Theorem 1 that when a new item (e.g., an insti- 
tution) enters the competition that leads to the ranking, we are still able 
to rank the top jo institutions correctly. In this sense, the institutions that 
make up the cohort of size jo do not need to be fixed. 

It is also of interest to consider cases where the common distribution, $F$,
of the $\Theta_j$'s is bounded to the left, for example, where $F(x) \asymp x^{\alpha}$ as $x \downarrow 0$.
However, it can be shown that in this context, unless $p$ is constrained to be
a sufficiently low degree polynomial function of $n$, very few of the estimated
ranks $\hat R_j$ will agree with the correct values $R_j$.

To indicate why, we first recall the model introduced in Section 2, where
the estimated ranks $\hat R_j$ are derived by ordering the values of $\bar Q_j + \Theta_j$. Here
$\bar Q_j = n^{-1}\sum_{1 \le i \le n} Q_{ij}$ is the average value of $n$ independent and identically
distributed random variables with zero mean. Therefore the means, $\bar Q_j$, are
of order $n^{-1/2}$. By way of contrast, if we take $\alpha = 1$ in the formula $F(x) \asymp x^{\alpha}$
as $x \downarrow 0$, for example, if $F$ is the uniform distribution on $[0, 1]$, then the spacings
of the order statistics $\Theta_{(1)} \le \cdots \le \Theta_{(p)}$ are approximately of size $p^{-1}$.
(More concisely, they are of size $Z/p$ where $Z$ has an exponential distribution;
an independent version of $Z$ is used for each spacing.) Therefore, if $p$ is
of larger order than $n^{1/2}$ then the errors of the "estimators" $\bar Q_j + \Theta_j$ of $\Theta_j$,
for $1 \le j \le p$, are an order of magnitude larger than the spacings among the
$\Theta_j$'s. This can make it very difficult to estimate the ranks of the $\Theta_j$'s from
the ranks of values of $\bar Q_j + \Theta_j$. Indeed, it can be shown that, in the difficult
case where the components of the $Q_i$'s are independent, and even for fixed
$j_0$, if $\alpha = 1$ and $p$ is of larger order than $n^{1/2}$ then in contrast to (2.5),

(3.8)    $P(\hat R_j = R_j \text{ for } 1 \le j \le j_0) \to 0$.
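
The comparison between spacings and noise in the uniform case can be checked numerically. The following is a small sketch under assumed values of $n$ and $p$ (chosen only so that $p$ is much larger than $n^{1/2}$); the variable names are hypothetical and the script is not part of the original argument.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 1000                        # illustrative values with p much larger than n**0.5

theta = np.sort(rng.uniform(size=p))    # ordered true attributes on [0, 1]
mean_spacing = np.diff(theta).mean()    # average gap between neighbours, close to 1/p
noise_scale = n ** -0.5                 # order of magnitude of each Q-bar_j

print("mean spacing:", mean_spacing)
print("noise scale :", noise_scale)
print("ratio       :", noise_scale / mean_spacing)
```

A ratio well above 1 indicates that the estimation error in each $\bar Q_j + \Theta_j$ swamps the gaps between neighbouring $\Theta_j$'s, which is the mechanism behind (3.8).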

This explains why, when $F(x) \asymp x^{\alpha}$, it can be quite rare for the estimated
ranks $\hat R_j$ to match their true values. Indeed, no matter what the value of $\alpha$
and no matter what the value of $j_0$, property (2.5) will typically fail to hold
unless $p$ is no greater than a sufficiently small power of $n$, in particular unless
$p = o(n^{\alpha/2})$, as the next result indicates. Thus, the differences between the
cases of bounded and unbounded distributions are stark, as can be seen by
contrasting Theorem 1 with the properties described below.

Theorem 2. Assume that $(d/dx)F(x) \asymp x^{\alpha-1}$ as $x \downarrow 0$, where $\alpha > 0$,
and that (3.3) holds. Part (a): instances where (2.5) holds and $p^2/n^{\alpha} \to 0$.
Under the latter condition, (i) if $\alpha < \frac{1}{2}$ then (2.5) holds even for $j_0 = p$; (ii)
if $\alpha = \frac{1}{2}$ then (2.5) holds provided that

(3.9)    $(\log j_0)(p^2/n^{\alpha}) \to 0$;

and (iii) if $\alpha > \frac{1}{2}$ then (2.5) holds provided that

(3.10)    $j_0 = o\{(n^{\alpha/2}/p)^{1/(2\alpha-1)}\}$.

Part (b): converses to (a)(ii) and (a)(iii). If $p^2/n^{\alpha} \to 0$ and the components
of the vectors $Q_i$ are independent then, if (2.5) holds, so too does (3.9) (if
$\alpha = \frac{1}{2}$) or (3.10) (if $\alpha > \frac{1}{2}$). Part (c): instances where (3.8) holds. If $\alpha > 0$
and $p^2/n^{\alpha} \to \infty$, and if the components of the vectors $Q_i$ are independent,
then (3.8) holds even for $j_0 = 1$.

The proof of Theorem 2 is similar to that of Theorem 1, and so is omitted. 
Theorem 1 is derived in Section 5. Both results continue to hold if the sample 
from which $\bar X_j$ is computed is of size $n_j$ for $1 \le j \le p$, rather than $n$, provided
that (2.6) holds. 

4. Numerical properties. This section discusses three real-data and three 
simulated examples linked to the theoretical properties in Section 3. The 
real-data examples make use of the bootstrap to create prediction intervals 
[Xie, Singh and Zhang (2009), Hall and Miller (2009)]. In each simulated 
example, the error is relatively light-tailed, and any discussion of tails refers 
to the distribution of the $\Theta_j$'s. In our real-data examples, the noise has been
averaged and so it is also generally light-tailed. Thus, any heavy-tailed behavior
present in the real-data examples is likely to be due to heavy tails of
the distribution of the $\Theta_j$'s, rather than the noise.

Example 1 (Continued). The originating institutions of Nature articles
were obtained using the ISI Web of Knowledge database$^3$ for each of
the years 1999 through 2008. A point ranking was obtained by taking the
average number of articles published per year. Of course, there are implicit
simplifying assumptions in doing this, most significantly concerning the independence
of articles between years, and the stationarity of means over time.
These assumptions appear reasonable in context, and are consistent with
most publication-based analyses.

When constructing prediction intervals the bootstrap resamples for each
institution were drawn independently, conditional on the data. [See Hall and
Miller (2009).] The number of observations in the resample can be varied to
create different time windows, as illustrated in Figure 1. The most natural
question from a ranking correctness viewpoint is determining the behavior
at the right tail; there are many institutions with mean at or near the hard
threshold of zero, so there is little hope for ranking correctness in the left
tail. Furthermore, the right tail appears to be long. Harvard University has
an average of 67.5 papers per year, followed by means of 34.6, 29.6 and 28.2
for Berkeley, Stanford and Cambridge, respectively.

3 www.isiknowledge.com.
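
The resampling scheme described above, with resamples drawn independently for each institution and a resample size that can be varied, can be sketched as follows. This is a simplified illustration and not the exact procedure of Hall and Miller (2009): the percentile construction of the intervals, the function name `bootstrap_rank_intervals`, and the synthetic Poisson counts in the usage lines are all assumptions.

```python
import numpy as np

def bootstrap_rank_intervals(counts, n_years, n_boot=1000, level=0.90, seed=0):
    """counts: array of shape (years, institutions) holding yearly article counts.
    Returns (lower, upper) percentile bounds on each institution's rank (rank 1 = largest mean)."""
    rng = np.random.default_rng(seed)
    years, p = counts.shape
    ranks = np.empty((n_boot, p), dtype=int)
    for b in range(n_boot):
        # Independent resampling per institution: each column gets its own year indices.
        idx = rng.integers(0, years, size=(n_years, p))
        means = counts[idx, np.arange(p)].mean(axis=0)
        order = np.argsort(-means, kind="stable")   # institutions from largest to smallest mean
        ranks[b, order] = np.arange(1, p + 1)
    tail = (1.0 - level) / 2.0
    lower = np.quantile(ranks, tail, axis=0)
    upper = np.quantile(ranks, 1.0 - tail, axis=0)
    return lower, upper

if __name__ == "__main__":
    demo_rng = np.random.default_rng(42)
    rates = np.array([60.0, 35.0, 30.0, 28.0, 10.0, 9.0, 8.0])   # synthetic mean counts
    demo_counts = demo_rng.poisson(rates, size=(10, rates.size))
    for n_years in (1, 5, 10):
        print(n_years, bootstrap_rank_intervals(demo_counts, n_years))
```

Changing `n_years` plays the role of the time window: smaller windows give noisier resampled means and hence wider rank intervals.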

A natural question to ask is what the tail shape for this example might
be. Approaches to estimating the shape parameter of a distribution with
regularly varying tails, such as the method of Hill (1975), are unstable for
these data; the number of extreme data for which a linear fit is plausible is
very small, implying that the decay rate is faster than polynomial. Indeed,
the left panel of Figure 3 shows the QQ plot of the observed data against
a random variable with distribution function $F(x) = 1 - \exp(-0.85x^{1/2})$,
which suggests that an exponential tail might be reasonable for the data. If
this is the case then the number of institutions that we expect to be ranked
correctly should depend, to first order, only on $n$, not on $p$, and be of order
up to $n^{1/4}(\log n)^{1/2}$. One way to explore this further is to take $j_0$ as given,
and to resample from the data, seeking, for example, the number of years,
$n$, needed to obtain correct ranking of the first $j_0$ institutions at least 90%
of the time. A plot of $j_0$ against $n^{1/4}(\log n)^{1/2}$ should be roughly linear. The
right-hand panel of Figure 3 plots results of this experiment and appears
to support the hypothesis. The flatness between $j_0 = 3$ and $j_0 = 4$ indicates
that these two institutions are quite difficult to separate from each other.
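
For readers unfamiliar with the estimator of Hill (1975) mentioned above, a minimal sketch of its usual upper-tail form is given here. The function name `hill_tail_index` and the Pareto test sample are assumptions for illustration; the choice of the number $k$ of upper order statistics is left to the user and is the source of the instability noted in the text.

```python
import numpy as np

def hill_tail_index(x, k):
    """Hill estimate of the tail index alpha from the k largest values of a positive sample."""
    xs = np.sort(np.asarray(x, dtype=float))[::-1]    # descending order
    if not 0 < k < xs.size or xs[k] <= 0:
        raise ValueError("need 0 < k < len(x) and a positive threshold value")
    # Mean log-excess of the k largest values over the (k+1)th largest value.
    gamma = np.mean(np.log(xs[:k] / xs[k]))
    return 1.0 / gamma

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample = rng.pareto(2.0, size=20_000) + 1.0       # classical Pareto with tail index 2
    for k in (100, 500, 2000):
        print(k, hill_tail_index(sample, k))
```

Plotting the estimate against $k$ is a common diagnostic: a stable plateau supports a polynomial tail, whereas estimates that drift steadily with $k$ point towards a lighter tail.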




Fig. 3. The left panel is a QQ plot for the Nature data against the exponential distribution.
The right panel plots a transform of the number of years of data required to rank $j_0$
institutions correctly for various $j_0$.

Example 2 (Continued). The Mann-Whitney test statistic can be written
as



where the $X_j$'s and $Y_j$'s are the observed values of the two samples. Notice
that this statistic will have a hard lower threshold at $n_1 n_2/2$, where $n_1$ and
$n_2$ are the sizes of the two classes. Here, as in the previous example, when
the distributions differ only in location the difference has to be quite large
to be detectable. Figure 4 shows the estimated density as well as the truncated
normal density, which is the distribution that the scores would have
if none of the genes had systematically different means for the two classes.
This suggests that an assumption that the majority of genes is unrelated to
whether the tissue is tumorous is not valid here.
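
A statistic with a hard lower threshold at $n_1 n_2/2$ can be obtained by taking the larger of the two one-sided Mann-Whitney counts. The sketch below shows one standard form with this property; it is an assumption made for illustration and is not necessarily the exact expression used in the paper. The function name `two_sided_mann_whitney` and the treatment of ties (each tie counted as one half) are likewise hypothetical choices.

```python
import numpy as np

def two_sided_mann_whitney(x, y):
    """Return max(U, n1*n2 - U), where U counts pairs (i, j) with x_i < y_j."""
    x = np.asarray(x, dtype=float)[:, None]   # column vector of first-sample values
    y = np.asarray(y, dtype=float)[None, :]   # row vector of second-sample values
    # Pairwise comparison; ties contribute one half (an assumed convention).
    u = np.sum(x < y) + 0.5 * np.sum(x == y)
    n1n2 = x.size * y.size
    return max(u, n1n2 - u)

if __name__ == "__main__":
    normal = [1.2, 0.8, 1.1, 0.9]
    tumor = [1.9, 2.3, 1.0, 2.1, 2.5]
    print(two_sided_mann_whitney(normal, tumor))
```

Because the two one-sided counts sum to $n_1 n_2$, their maximum can never fall below $n_1 n_2/2$, which matches the threshold described above.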

Bootstrapped versions of the dataset with different choices for n were 
created to indicate how many observations we need to obtain reasonable 
confidence in a ranking. Table 1 shows the probability that the set of the 
top j genes is identified correctly out of the 2000 for various j and n. Note 
that this is a slightly different statistic from the one in (2.5), since we allow 
any permutation of the top j genes to be detected. The results suggest that 
we have nearly a 50% chance of detecting the top gene if n = 250, and a 20% 
chance of correctly choosing the top four. The upper tail for this dataset 
again appears relatively light; the model $F(x) = 1 - \exp\{-0.19(x-1)^2\}$, for
$x > 1$, produces a good fit to the upper tail.

Theorem 1 suggests that these probabilities should not depend on the 
choice of p. We can obtain a sense of this by randomly sampling, without 
replacement, p = 500 or p = 1000 genes from the original p = 2000, for each 
simulation; and recalculating the values in Table 1. For j = 4 and n = 250, 



Fig. 4. Estimated sampling density of genes under the Mann-Whitney test for
colon data (horizontal axis: scaled Mann-Whitney score).






Table 1
Probability that the set of top $j$ genes is correct for colon data

                         n
$j$      62      100     150     200     250
1        0.251   0.326   0.437   0.446   0.490
2        0.067   0.109   0.166   0.218   0.277
4        0.022   0.054   0.094   0.163   0.193
6        0.007   0.018   0.035   0.040   0.068



the respective probabilities were 0.183 and 0.170, quite close to the value
0.193 observed for $p = 2000$. While the equivalence appears good for
$j \ge 4$, there are larger departures for $j = 1$ or 2, where the initial
results for this particular realization tend to distort the calculation.
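
The difference between identifying the set of the top $j$ genes (the criterion
of Table 1) and identifying their exact order, as in (2.5), can be made
concrete with a small Python sketch. It is illustrative only: the function
names and the toy scores are assumptions introduced here, not part of the
original analysis.

```python
def top_indices(scores, j):
    """Indices of the j largest scores, ordered from largest to smallest."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:j]

def top_set_correct(true_scores, observed_scores, j):
    """True if the observed top-j set equals the true top-j set, in any order."""
    return set(top_indices(true_scores, j)) == set(top_indices(observed_scores, j))

def top_order_correct(true_scores, observed_scores, j):
    """Stricter criterion: the top j items must also appear in the same order."""
    return top_indices(true_scores, j) == top_indices(observed_scores, j)

# Toy data (hypothetical): items 0 and 1 swap places under noise.
true_scores = [5.0, 4.0, 3.0, 1.0]
observed_scores = [4.1, 4.6, 2.9, 1.2]
set_ok = top_set_correct(true_scores, observed_scores, 2)      # set criterion
order_ok = top_order_correct(true_scores, observed_scores, 2)  # order criterion
```

In this toy case the set criterion is satisfied while the order criterion is
not, which is why probabilities under the set criterion can exceed those under
the order criterion.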

Example 3 (School rankings). A third example of accuracy in the extremes of a
ranking is based on student performance at 75 private schools in NSW,
Australia. For each school the number of final year exams taken, and the
number of these where a score of at least 90% was achieved, were recorded. The
proportion of exams where 90% or more was scored can be used to rank the
schools, and prediction intervals can be constructed by resampling from
appropriate binomial distributions. The results in Figure 5 indicate the
increased confidence we can have in the upper extreme, with the top school
identified with reasonable certainty. In this example, the possible range of
scores for ranking has finite support, being restricted to the interval
$[0,1]$; thus it is a context where Theorem 2 is applicable.
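
The binomial resampling described in Example 3 might be sketched as follows.
This is a minimal Python sketch under stated assumptions: each school's count
is redrawn from a binomial distribution with the observed proportion as
success probability, ranks are recomputed for each draw, and empirical
quantiles of the resampled ranks form the interval. The function name, the
quantile rule and the toy counts are hypothetical; the authors' exact
procedure may differ.

```python
import random

def rank_intervals(successes, trials, reps=1000, level=0.90, seed=0):
    """Return, for each school, a (lower, upper) interval for its rank
    (1 = highest proportion), based on binomial resampling."""
    rng = random.Random(seed)
    m = len(trials)
    p_hat = [s / t for s, t in zip(successes, trials)]
    ranks = [[] for _ in range(m)]
    for _ in range(reps):
        # One Binomial(t, p) draw per school, built from t Bernoulli trials.
        props = [sum(rng.random() < p for _ in range(t)) / t
                 for p, t in zip(p_hat, trials)]
        order = sorted(range(m), key=lambda i: props[i], reverse=True)
        for r, i in enumerate(order, start=1):
            ranks[i].append(r)
    tail = (1 - level) / 2
    intervals = []
    for rs in ranks:
        rs.sort()
        lower = rs[int(tail * (reps - 1))]
        upper = rs[int((1 - tail) * (reps - 1))]
        intervals.append((lower, upper))
    return intervals

# Hypothetical counts: exams scored at 90% or more, and exams taken.
example = rank_intervals([90, 40, 35, 10], [200, 200, 200, 200], reps=200)
```

With these made-up counts the first school is far ahead of the rest, so its
interval is narrow, whereas the second and third schools have similar
proportions and their intervals overlap.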

Hill's (1975) estimator of $\alpha$, when (3.2) holds, is relatively stable in
this example and suggests that $\alpha \approx 6$. From (3.10), we can
calculate that $(n^{\alpha/2}/p)^{1/(2\alpha-1)} \approx 4$, which is
consistent with a small number of schools being correctly ranked. If the
number were large, then we would expect a significant portion of the schools
to be ranked with a high degree of accuracy. In the case of these data,
however, the small value suggests that it might not be possible to obtain any
correct ranks.

Fig. 5. Rankings of schools by students' exam performance with prediction
intervals.

Example 4 (Simulation with exponential tails and infinite support).
Here, we simulate increasing $n$ and $p$ in the case of exponential tails. For
a given $n$, set $p = 0.0005n^2$, let the $\Theta_j$'s be drawn from a standard
exponential distribution and the $Q_{ij}$'s be normal random variables with
zero mean and standard deviation 3.5. Table 2 shows the results of 1000
simulations for various values of $n$, approximating (2.5) for different
choices of $j_0$. Theorem 1 suggests that the results should converge to 1 if
$j_0 = o(n^{1/4})$, and degrade otherwise. This appears consistent with the
results. The difficulty of the problem due to the quadratic growth of $p$ and
the large error in $Q_{ij}$ is also evident; even when $j_0 = 1$ and $n$ is
large, reliable prediction of the top rank is not assured.
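
A simulation of the kind described in Example 4 might be sketched as follows
in Python. Several modelling details are assumptions made here for
illustration: each observed value is taken to be $\Theta_j$ plus the average
of $n$ independent errors, that average is drawn directly as a normal variable
with standard deviation $3.5/\sqrt{n}$, and the ranks of interest are those of
the largest values. The function names are hypothetical.

```python
import random

def first_ranks_correct(theta, x, j0):
    """True if the j0 largest x-values belong to the same items, in the same
    order, as the j0 largest theta-values."""
    def top(values):
        order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
        return order[:j0]
    return top(theta) == top(x)

def estimate_probability(n, j0, reps=200, sigma=3.5, seed=0):
    """Monte Carlo estimate of the chance that the first j0 ranks are correct."""
    rng = random.Random(seed)
    p = max(j0 + 1, round(0.0005 * n * n))  # p = 0.0005 n^2, as in the text
    sd_mean = sigma / n ** 0.5              # sd of the average of n errors (assumed model)
    hits = 0
    for _ in range(reps):
        theta = [rng.expovariate(1.0) for _ in range(p)]
        x = [t + rng.gauss(0.0, sd_mean) for t in theta]
        hits += first_ranks_correct(theta, x, j0)
    return hits / reps
```

Calling, say, `estimate_probability(500, 1)` gives a rough estimate for the
smallest $n$ considered; larger $n$ make $p$ grow quadratically, so run time
grows accordingly.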

Example 5 (Simulation with polynomial tails and infinite support). We
use the same setup as in the previous example, except that the generating
distribution for the $\Theta_j$'s is Pareto, $F(x) = 1 - x^{-\alpha}$ for
$x > 1$, with $\alpha = 4$. Theorem 1 and (3.4) suggest that the rate
$n^{4/18} p^{1/9} \asymp n^{4/9}$ is critical for $j_0$, and this is consistent
with the results in Table 3. This is an easier problem than that in the
previous example, because of the polynomial decay of the tail. For instance,
the top right-hand result in the table suggests that the top nine ranks can be
correctly ascertained more than 90% of the time when
Table 2
Probability that the first $j_0$ rankings are correct in the case of exponential tails

                                        n
$j_0$        500     1000    2000    5000    10,000  20,000  50,000
1            0.909   0.9365  0.959   0.970   0.9745  0.9840  0.9910
$n^{0.15}$   0.764   0.823   0.767   0.844   0.897   0.872   0.890
$n^{0.20}$   0.591   0.700   0.655   0.683   0.667   0.664   0.743
$n^{0.25}$   0.420   0.406   0.424   0.383   0.334   0.402   0.428
$n^{0.30}$   0.183   0.188   0.180   0.116   0.101   0.079   0.069
$n^{0.35}$   0.056   0.030   0.021   0.004   0.002   0.000   0.001
Table 3
Probability that the first $j_0$ rankings are correct in the case of polynomial tails

                                             n
$j_0$             500     1000    2000    5000    10,000  20,000  50,000
$(1/5)n^{0.35}$   0.884   0.832   0.908   0.920   0.898   0.921   0.945
$(1/5)n^{0.40}$   0.694   0.672   0.708   0.731   0.801   0.786   0.803
$(1/5)n^{4/9}$    0.477   0.510   0.586   0.568   0.569   0.520   0.540
$(1/5)n^{0.50}$   0.283   0.242   0.252   0.161   0.140   0.120   0.096
$(1/5)n^{0.55}$   0.071   0.086   0.031   0.020   0.006   0.002   0.001



p > 50,000, whereas the figure 0.890 in the last column of Table 2 suggests 
that, for the distribution represented there, only the top five ranks have this 
level of reliability. 
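
The rank counts quoted in this comparison follow from the row labels of
Tables 2 and 3 evaluated at $n = 50{,}000$. The short Python check below is a
sketch: the helper names are hypothetical, and rounding $j_0$ to the nearest
integer is an assumption, since the text does not say how non-integer values
of $j_0$ were handled.

```python
def j0_table2(n, exponent):
    """Row label of Table 2: j0 = n ** exponent."""
    return n ** exponent

def j0_table3(n, exponent):
    """Row label of Table 3: j0 = (1/5) * n ** exponent."""
    return 0.2 * n ** exponent

n = 50_000
# Top row, last column of Table 3 (entry 0.945).
ranks_polynomial = round(j0_table3(n, 0.35))
# Second row, last column of Table 2 (entry 0.890).
ranks_exponential = round(j0_table2(n, 0.15))
```

Under the rounding assumption these evaluate to nine and five ranks
respectively, matching the figures given in the text.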

Example 6 (Simulation with polynomial tails with finite support). Theorem 2
has many interesting consequences, but the present example focuses on case
(iii), where $\alpha > \frac12$. First, let the $\Theta_j$'s be uniformly
distributed on $[0,1]$, and consider a case where the entire ranking is
correct. Using the notation of Section 3 and taking $\alpha = 1$, Theorem 2
implies that $p \asymp n^{1/4}$ defines the critical growth in dimension. For
simulation, we took $p = 2n^k$ for various $k$, and scaled the (normally
distributed) error for each $k$ such that the $n = 500$ case had probability
approximately 0.5 of correctly identifying all ranks. Each simulation was
repeated 10,000 times, with results summarized in Table 4. As predicted,
growth rates in dimension slower than $n^{1/4}$ have probability of correct
ranking tending to 1, while those faster than $n^{1/4}$ degrade.

Next, we examine the case $p = 5 \times 10^{-6} n^2$, where dimension grows at
a quadratic rate; and $F(x) = x^{\alpha}$ on $[0,1]$, with $\alpha = 6$,
implying a reasonably severe tail. Theorem 2 suggests that if
$j_0 = o(p^{1/22})$, or equivalently if $j_0 =$

Table 4
Probability all ranks identified correctly when $\Theta_j$ is uniformly distributed

                                   n
$k$     500     1000    2000    5000    10,000  20,000  50,000
1/6     0.502   0.494   0.525   0.593   0.635   0.658   0.701
1/5     0.498   0.511   0.471   0.558   0.568   0.578   0.606
1/4     0.497   0.478   0.492   0.505   0.517   0.496   0.502
1/3     0.500   0.457   0.395   0.343   0.289   0.259   0.212
1/2     0.502   0.369   0.249   0.107   0.046   0.011   0.000
Table 5
Probability that lowest $10n^k$ scores identified correctly

                                                  n
$k$     5x10^3   1x10^4   2x10^4   5x10^4   1x10^5   2x10^5   5x10^5   1x10^6
0.05    0.500    0.539    0.553    0.583    0.603    0.609    0.628    0.641
0.07    0.502    0.532    0.506    0.546    0.558    0.580    0.555    0.591
1/11    0.497    0.486    0.489    0.516    0.489    0.463    0.513    0.496
0.11    0.497    0.481    0.471    0.432    0.461    0.447    0.452    0.421
0.13    0.506    0.492    0.461    0.481    0.445    0.427    0.387    0.385


$o(n^{1/11})$, then (2.5) should hold. Table 5 shows the probability of ranking
the smallest $j_0 = 10n^k$ scores correctly for various $k$ and $n$, with
10,000 simulations. Again the normal error is tuned so that the $n = 5000$
case has probability close to $\frac12$. The results suggest that $n^{1/11}$
indeed separates the values of $k$ for which correct ranking is possible from
those for which it is not.
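
The equivalence between the two rates quoted for this second simulation can be
checked directly. The display below is a sketch that assumes the critical
quantity for finite support has the form $(n^{\alpha/2}/p)^{1/(2\alpha-1)}$,
as read from the reference to (3.10) in Example 3; constants are suppressed
and $\alpha = 6$, $p \propto n^2$ are substituted.

```latex
\[
\Bigl(\frac{n^{\alpha/2}}{p}\Bigr)^{1/(2\alpha-1)}
  \asymp \Bigl(\frac{n^{3}}{n^{2}}\Bigr)^{1/11} = n^{1/11},
\qquad
p^{1/22} \asymp (n^{2})^{1/22} = n^{1/11}.
\]
```

The same form with $\alpha = 1$ and $j_0 = p$ gives $p \asymp n^{1/2}/p$, that
is $p \asymp n^{1/4}$, which agrees with the critical growth rate used for
Table 4.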

5. Technical arguments. We begin by giving a brief sketch of the proof
of Theorem 1. Two steps in the proof are initially presented as lemmas, the
first using moderate deviation properties to approximate sums related to the
object of interest, and the second employing Taylor's expansion applied to
Rényi representations of order statistics to show that the gaps
$\Theta_{(j+1)} - \Theta_{(j)}$ have a high probability of being of reasonable
size. In the proof itself, we use Lemma 1 to bound the probability in (2.5)
from below [see (5.19)] and then show that the last two terms in this
expression converge to zero, implying that the probability converges to 1 if
(3.6) holds. For the converse, assuming independence, we find an upper bound
to the probability in (5.20) and show that if this probability tends to one
then the sum $s(n)$, introduced at (5.21), must converge to zero, which in
turn implies (3.6). Only the exponential tail case is presented in detail;
comments at the end of the proof describe the main differences in the
polynomial tail case.

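Throughout, we let $\mathcal{E}(j_0)$ denote the event that
$\bar Q_{R_j} + \Theta_{R_j} > \bar Q_{R_{j_0}} + \Theta_{R_{j_0}}$ for
$j_0 + 1 \le j \le p$, we define $\mathcal{E}_j$ to be the event that
$\Theta_{(j+1)} - \Theta_{(j)} > -(\bar Q_{R_{j+1}} - \bar Q_{R_j})$, and we
take $\tilde{\mathcal{E}}(j_0)$ and $\tilde{\mathcal{E}}_j$ to be the
respective complements. Also, we let
$\zeta_j = \Theta_{(j+1)} - \Theta_{(j)}$ denote the $j$th gap, where
$\Theta_{(0)} = -\infty$ for convenience.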

In Lemma 1 below, we write $\mathcal{O}$ to denote the sigma-field generated
by the $\Theta_j$'s, $N$ for a standard normal random variable independent of
$\mathcal{O}$, $\delta_n$ for any given sequence of positive constants
converging to zero, and $\Delta$ for a generic random variable satisfying
$P(|\Delta| \le \delta_n) = 1$.
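Lemma 1. For any positive integer $j_0 < p$, let $\mathcal{J}$ denote the set
of positive, even integers less than or equal to $j_0$. Put

$T_{1j} = \frac{\min(\zeta_{j-1}, \zeta_j)}{2(\mathrm{var}\,\bar Q_{R_j})^{1/2}}, \qquad T_{2j} = \frac{\zeta_j}{\{\mathrm{var}(\bar Q_{R_{j+1}} - \bar Q_{R_j})\}^{1/2}}.$

Then

(5.1)   $\sum_{j=1}^{j_0} P\{|\bar Q_{R_j}| > \tfrac12 \min(\zeta_{j-1}, \zeta_j)\} = 2\{1 + o(1)\} \sum_{j=1}^{j_0} P(|N| > T_{1j}) + o(1).$

If in addition the components of the $Q_i$'s are independent then

(5.2)   $E\Bigl\{\prod_{j \in \mathcal{J}} P(\mathcal{E}_j \mid \mathcal{O})\Bigr\} \le \{1 + o(1)\}\, E\Bigl[\exp\Bigl\{-(1 + \Delta) \sum_{j \in \mathcal{J}} P(N > T_{2j} \mid \mathcal{O})\Bigr\}\Bigr].$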



Proof. Using the arguments of Rubin and Sethuraman (1965) and Amosova (1972),
it can be shown that, if the constant $C_2$ in (3.3) satisfies
$C_2 > B^2 + 2$ where $B > 0$, then as $n$ (and hence, also $p$) diverges,

(5.3)   $P\{|\bar Q_j| > x(\mathrm{var}\,\bar Q_j)^{1/2}\} = \{1 + o(1)\}\, 2\{1 - \Phi(x)\},$

(5.4)   $P[-(\bar Q_{j_1} - \bar Q_{j_2}) > x\{\mathrm{var}(\bar Q_{j_1} - \bar Q_{j_2})\}^{1/2}] = \{1 + o(1)\}\{1 - \Phi(x)\},$

uniformly in $0 < x < B(\log p)^{1/2}$ and $j, j_1, j_2 \ge 1$ such that
$j_1 \ne j_2$. Expression (5.4) requires the independence assumption.
Therefore, since $C_2 > 2(C_1 + 1)$ in (3.3), we can take
$B = (2 + \varepsilon)^{1/2}$ for some $\varepsilon > 0$, and then (5.3) and
(5.4) hold uniformly in $0 < x < \{(2 + \varepsilon) \log p\}^{1/2}$. Thus, as
$n \to \infty$, they hold uniformly in all $x > 0$, modulo an $o(p^{-1})$
term. We use (5.3) to derive (5.1), while (5.4) implies that

$\sum_{j \in \mathcal{J}} P(\tilde{\mathcal{E}}_j) = \{1 + o(1)\} \sum_{j \in \mathcal{J}} P(N > T_{2j}) + o(1),$

which leads to (5.2). □



Lemma 2. If (3.1), indicating the case of exponential tails, holds then
there exist $B_4, B_5 > 0$ such that, for any choice of constants $c_1, c_2$
satisfying $0 < c_1 < c_2 < (4 - \varepsilon)^{-1}$ with $\varepsilon$ as in
(3.5), and for all $B_6 > 0$,

(5.5)   $\inf_{j \in [1, n^{c_1}]} P\{\zeta_j Z_j^{-1} (\log n)^{1-(1/\alpha)} > B_4 n^{-c_1}\} = 1 - O(n^{-B_6}),$

(5.6)   $\inf_{j \in [n^{c_1}, n^{c_2}]} P\{B_4 < j \zeta_j Z_j^{-1} (\log n)^{1-(1/\alpha)} < B_5\} = 1 - O(n^{-B_6}).$

Note further that the constraint on C2 permits n C2 to be of size ^ e xp^ £l (where 

El > 0). 



Proof. If $U_{(1)} \le \cdots \le U_{(p)}$ denote the order statistics of a
sample of size $p$ drawn from the uniform distribution on $[0,1]$ then, for
each $p$, we can construct a collection of independent random variables
$Z_1, \ldots, Z_p$ with the standard negative exponential distribution on
$[0, \infty)$, such that, for $1 \le j \le p$, $U_{(j)} = 1 - \exp(-V_j)$
where

$V_j = \sum_{k=1}^{j} \frac{Z_k}{p - k + 1} = w_j + W_j.$

For details, see Rényi (1953). Further, uniformly in
$1 \le j \le \frac12 p$ and $2 \le p < \infty$,

P -t 

1 3 



(5.7) 



£ - = J - + 0(f/p 2 ) = 0(j/p), 
^-^ k p 



k=p-j+l 



(5.8) Wj= V k~ l (Z p _ k+ i - 1), 

fc=p-j+i ^^f/ 2 

1 P 



(5.9) 



sup j 

i<i<p/2 



-3/2 



P 



sup j~ 1/2 \Wj\ <p -1 W(p), 



<P~ 2 W(p), 



k=p—j+i 



where the nonnegative random variable $W(p)$, which without loss of
generality we take to be common to (5.8) and (5.9), satisfies
$P\{W(p) > p^{\varepsilon}\} = O(p^{-C})$ for each $C, \varepsilon > 0$.

Using the second identity in (5.7), and (5.8), we deduce that 



U U+1) - U 0) = (V j+1 - Vj){l - \(V J+1 + Vj) 



(5.10) 



+ -Av* +l + v 3 v J+l + v 2 



p-j 



l + ^il - + 



J'l 



p p 



1/2 
uniformly in $1 \le j \le \frac12 p$, where the random variable $\Psi_{j1}$
satisfies, for $k = 1$,

(5.11)   $P\Bigl(\max_{1 \le j \le p/2} |\Psi_{jk}| \le A\Bigr) = 1,$

$A > 0$ is an absolute constant, and for each $C, \varepsilon > 0$ the
nonnegative random variable $\delta_{j1}$ satisfies, with $k = 1$,

(5.12)   $P\Bigl(\sup_{1 \le j \le p/2} \delta_{jk} > p^{\varepsilon}\Bigr) = O(p^{-C}).$

Using the third identity in (5.7) and (5.9), we deduce that 

1 

2 ( 



0<U U) = w J + W j --(w j + W,) 2 + 



(5.13) 



p \p* p 

where ^2 and 5,2 > satisfy (5.11) and (5.12), respectively. 

Define $D_j = U_{(j+1)} - U_{(j)}$ and take, without loss of generality,
$C_0 = 1$ in (3.1). If the common distribution function of the $\Theta_j$'s is
$F$, then by Taylor's expansion,

$\zeta_j = F^{-1}(U_{(j)} + D_j) - F^{-1}(U_{(j)})$



(5.14) =D j {F-')'(U {j) +u ] D 



j ■ 



where < u j < 1 and the last line makes use of (3.1). The random variable 
tyj satisfies, for constants B\, B2 and B3 satisfying < B\ < B2 < 00 and 
< B 3 < 1, 

-P(-Bi < *j < -B2 for all j such that E/y+i) < B 3 ) = 1. 
The required result then follows from (5.10), (5.13) and (5.14). □ 
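
The Rényi (1953) representation used at the start of the proof of Lemma 2 can
be checked numerically. The following Python sketch builds uniform order
statistics from independent standard exponentials via
$V_j = \sum_{k \le j} Z_k/(p - k + 1)$ and $U_{(j)} = 1 - \exp(-V_j)$; the
function name is an assumption made here for illustration.

```python
import math
import random

def uniform_order_statistics(p, rng):
    """Return U_(1) <= ... <= U_(p), the order statistics of p uniforms on
    [0, 1], generated from independent standard exponentials Z_1, ..., Z_p
    through V_j = sum_{k<=j} Z_k / (p - k + 1) and U_(j) = 1 - exp(-V_j)."""
    v = 0.0
    u = []
    for k in range(1, p + 1):
        v += rng.expovariate(1.0) / (p - k + 1)  # V_k = V_{k-1} + Z_k/(p-k+1)
        u.append(1.0 - math.exp(-v))
    return u

sample = uniform_order_statistics(1000, random.Random(1))
```

Because each increment $Z_k/(p - k + 1)$ is nonnegative, the output is sorted
by construction, and the spacings $U_{(j+1)} - U_{(j)}$ can be inspected
directly without sorting a uniform sample.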

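Proof of Theorem 1. Take $j_0 < p$ a positive integer. Note that, taking
$\mathcal{E}(j_0)$, $\mathcal{E}_j$, $\tilde{\mathcal{E}}(j_0)$,
$\tilde{\mathcal{E}}_j$, $\zeta_j$ and $\mathcal{J}$ as for Lemma 1,

$\{\hat R_j = R_j \text{ for } 1 \le j \le j_0\} \supseteq \{|\bar Q_{R_j}| \le \tfrac12 \min(\zeta_{j-1}, \zeta_j) \text{ for } 1 \le j \le j_0\} \cap \mathcal{E}(j_0),$

where we define $\Theta_{(j-1)} = -\infty$ if $j = 1$ as before. Therefore,
defining $\pi(j_0) = P(\hat R_j = R_j \text{ for } 1 \le j \le j_0)$, we
deduce that

(5.15)   $\pi(j_0) \ge 1 - \sum_{j=1}^{j_0} P\{|\bar Q_{R_j}| > \tfrac12 \min(\zeta_{j-1}, \zeta_j)\} - P\{\tilde{\mathcal{E}}(j_0)\}.$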
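Also,

$\{\hat R_j = R_j \text{ for } 1 \le j \le j_0\}$
   $= \{X_{R_1} < \cdots < X_{R_{j_0}} \text{ and } X_j > X_{R_{j_0}} \text{ for } j \notin \{R_1, \ldots, R_{j_0}\}\}$
   $= \{\zeta_j > -(\bar Q_{R_{j+1}} - \bar Q_{R_j}) \text{ for } 1 \le j < j_0 \text{ and } \Theta_j - \Theta_{(j_0)} > -(\bar Q_j - \bar Q_{R_{j_0}}) \text{ for } j \notin \{R_1, \ldots, R_{j_0}\}\},$

and so

(5.16)   $\pi(j_0) \le P\{\zeta_j > -(\bar Q_{R_{j+1}} - \bar Q_{R_j}) \text{ for } 1 \le j \le j_0\}.$

Letting $\pi_1(j_0)$ denote the probability that $\mathcal{E}_j$ holds for all
$j \in \mathcal{J}$, by (5.16),

(5.17)   $\pi(j_0) \le \pi_1(j_0).$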

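Note that if the components of each $Q_i$ are independent, then the events
$\mathcal{E}_j$, for $j \in \mathcal{J}$, are independent conditional on
$\mathcal{O}$. Therefore,

(5.18)   $\pi_1(j_0) = E\Bigl\{\prod_{j \in \mathcal{J}} P(\mathcal{E}_j \mid \mathcal{O})\Bigr\} \le E\Bigl[\exp\Bigl\{-\sum_{j \in \mathcal{J}} P(\tilde{\mathcal{E}}_j \mid \mathcal{O})\Bigr\}\Bigr].$

Using Lemma 1, we have the following inequalities regarding $\pi(j_0)$:

(5.19)   $\pi(j_0) \ge 1 - 2\{1 + o(1)\} \sum_{j=1}^{j_0} P(|N| > T_{1j}) - P\{\tilde{\mathcal{E}}(j_0)\} + o(1),$

(5.20)   $\pi(j_0) \le \{1 + o(1)\}\, E\Bigl[\exp\Bigl\{-(1 + \Delta) \sum_{j \in \mathcal{J}} P(N > T_{2j} \mid \mathcal{O})\Bigr\}\Bigr].$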

To show that (3.6) implies (2.5), by (5.19) it is sufficient to show that
$P\{\tilde{\mathcal{E}}(j_0)\}$ and $\sum_{j=1}^{j_0} P(|N| > T_{1j})$ are
both $o(1)$, which we shall do in turn.

Define $\ell = (\log n)^{(1/\alpha)-1}$, let $N$ be a standard normal random
variable independent of $\mathcal{O}$, and let $Z$ be independent of $N$ and
have the standard negative exponential distribution. Let $K_1$ be a positive
constant. If $a_n$ is a sequence of positive numbers and $f_n$ is a sequence
of nonnegative functions,
write a n = f n {K) to mean that, for constants L\,L2 > 1, either (a) a n < 
Lif n (K) whenever K > L 2 and n is sufficiently large, and a n > L^ 1 f n (K) 
whenever K < L^ 1 and n is sufficiently large, or (b) a n > L^ l f n (K) when- 
ever K > L 2 and n is sufficiently large, and a n < Lif n (K) whenever K < L^ 1 
and n is sufficiently large. Let < c\ < c 2 < \ and c\ < j, and let jo and j\ 
denote integers satisfying \j\ — n Cl \ < 1, j\ < jo < n C2 and ji/jo — ^ 0. 






When (3.1) holds with Cq = 1, Lemma 2 implies that, for each Bq > 
and letting 7 j = n _1 / 2 j'£ _1 , 



Jo 



s(n) = J2P{\N\>K 1 n 1 / 2 (C j )} 



3=1 



0{j 1 P(\N\>K 2 Z 1 - 1 )+n- B «}+ £ P{\N\>KZ 1 j 1 ) 

jl<j<30 



0{jAP{Z< lh ) + E 



(5.21) 



+ E (p(Z<13)+E Z- 1 lj e^- 1 -{KZ 1 T^I(Z> lj ) ) 

Z- 1 7 il exp|-^^ 7 r 1 ) 2 |l(Z>7 J1 ) )} 
i(KZ 7 ^) 2 W> 7j ) 



ji<j<jo 



h<j<3o 



Now, 



i?^- 1 7 J exp|--(KZ 7 7 1 ) 2 |/(Z> 7 ,) 
= * _1 7i expj-i^Z^ 1 ) 2 - z| 
= 7i^ u^expj-^AV) 2 - 7i u|ciu>; 7i 

(Here, we have used the fact that j < jo < ^ C2 where C2 < i.) Therefore, 
s(n)^j 1 -n~ 1 ^j 1 r 1 + ^ n~ l l 2 jr l 

31<3<30 

(5.22) xn-V+n-^f 1 

(Here, we have used the fact that ji/j'o — > 0.) 

The right-hand side of (5.22) converges to zero if and only if (3.6) holds. 
Moreover, in view of the fact that 



$P(|N| > T_{1j}) \le P\Bigl\{|N| > \frac{\zeta_{j-1}}{2(\mathrm{var}\,\bar Q_{R_j})^{1/2}}\Bigr\} + P\Bigl\{|N| > \frac{\zeta_j}{2(\mathrm{var}\,\bar Q_{R_j})^{1/2}}\Bigr\},$

and depending on the choice of $K_1$ in the definition of $s(n)$ at (5.21),
$s(n)$ can be an upper bound to the series
$\sum_{j=1}^{j_0} P(|N| > T_{1j})$ on the right-hand side of (5.1). Hence,

(5.23)   $\sum_{j=1}^{j_0} P(|N| > T_{1j}) = o(1).$

This deals with the second term on the right-hand side of (5.19). Similarly,
if $r \in [2, \infty)$ is a fixed integer, and if
$j_0 = o(n^{1/4} \ell^{1/2})$, then

(5.24)   $s_1(n) \equiv \sum_{j=j_0+1}^{j_0+r-1} P\{|N| > K_1 n^{1/2} \zeta_j\} = o(1).$

Moreover, if j\ denotes the integer part of n C2 — jo then, for constants K 2 
and K3 satisfying K\ > K 2 > K3 > 0, and for any B > 0, 

30+h 

S2 (n)= £ P{|iV|>lW /2 (e (j . +1) -e (io) )} 



n B ) 



j=r I k=l ) 

(5.25) 

< hP{\N\ > K 2 n 1 / A l l '\Z 1 + ■■■ + Z r )} + 0(n~ B ) 

= 0{j 1 (n l / 2 £ 2 y r }, 

where we have assumed that jo = o(n 1 / 4 ^ 1 / 2 ) and also used the fact that 

Z\ H + Z r has a gamma(r, 1) distribution. If we choose r so large that 

p n -r/2 _ o(n~ £ ) for some e > 0, then we can deduce from (5.24) and (5.25) 
that si(n) + S2(n) — > 0, and hence, by (5.6), that 

n c 2 

(5.26) ]T p(Q Rj + e Rj > q Rjo + e Rjo ) -> 0. 

i=io+i 

A more crude argument can be used to prove that if r is so large that 
p2 n -r/2 _ Q( n -e) f or som e e > 0, and if jo = o(n 1//4 ^ 1//2 ), then 

(5.27) £ p^+Q^xg^+e^J^O. 

n c 2 <j<p 

Together, (5.26) and (5.27) imply that if j = o(re 1/4 £ 1/2 ) then 

(5.28) ^{£(io)} -> 0. 

Thus, in light of (5.19), we see (5.23) and (5.28) imply that (3.6) is
sufficient for (2.5).

We next show that (2.5) implies (3.6) in the independent case. If (2.5)
holds, then by (5.20),

$\sum_{j \in \mathcal{J}} P(N > T_{2j} \mid \mathcal{O}) \to 0$

in probability. Therefore, by Lemma 2, with $j_0$ and $j_1$ as above, there
exists $K_1 > 0$ such that

$\sum_{j_1 < j \le j_0} P\{|N| > n^{1/2} K_1 \zeta_j \mid \mathcal{O}\} \to 0$

in probability. (We can take the sum over all $j \in [j_1 + 1, j_0]$, rather
than just over even $j$, since (5.2) holds for sums over odd $j$ as well as
over even $j$.) Hence, arguing as in the lines below (5.21), we deduce that
for sufficiently large $K_2 > 0$,

(5.29)   $\sum_{j_1 < j \le j_0} f(Z_j/\delta_j) \to 0$

in probability, where the random variables $Z_j$ are independent and have a
common exponential distribution, $\delta_j = n^{-1/2} j \ell^{-1}$ and

$f(z) = z^{-1} \exp(-K_2 z^2) I(z > 1).$

We claim that this implies that the expected value of the left-hand side of
(5.29) also converges to 0:

(5.30)   $\sum_{j_1 < j \le j_0} E\{f(Z_j/\delta_j)\} \to 0,$

or equivalently that $\sum_{j_1 < j \le j_0} \delta_j \to 0$, and thence [using
the argument leading to (5.22)] that
$s(n) \asymp n^{-1/2} j_0^2 \ell^{-1} \to 0$, which is equivalent to (3.6).
Therefore, if we establish (5.30) then we shall have proved that (2.5) implies
(3.6).

It remains to show that (5.29) implies (5.30). This we do by contradiction.
If (5.30) fails then, along a subsequence of values of $n$, the left-hand side
of (5.30) converges to a nonzero number. For notational simplicity, we shall
make the inessential assumptions that the number is finite and that the
subsequence involves all $n$, and we shall take $K_2 = 1$ in the definition of
$f$. In particular,

(5.31)   $t(n) \equiv \sum_{j_1 < j \le j_0} E\{f(Z_j/\delta_j)\} \to t(\infty),$

where $t(\infty)$ is bounded away from 0. Now,
$t(n) = \{1 + o(1)\} \mu(1) \delta(n)$, where
$\delta(n) = \sum_{j_1 < j \le j_0} \delta_j$ and, for general $\lambda \ge 1$,
$\mu(\lambda) = \int_{z > \lambda} z^{-1} \exp(-z^2)\,dz$. Therefore,

(5.32)   $\delta(n) \to \delta(\infty) \equiv t(\infty)/\mu(1).$
For each $\lambda > 1$ the left-hand side of (5.29) equals $A_1 + A_2$, where,
in view of (5.31),



(5.33) 
and 



E(A 2 )= ]T E{f(Zjl&iWi>Mj)} 
h<3<k 

= {l + o(l)} M (A)<S(n) 



Ai= E f(Z j /S j )I(Z j <\S j )= £ f(Wj)Ij 
h<i<io ji<j<jo 



with Wj = Zj/dj and Ij = I(5j < Zj < X5j). However, 

£ p l!i = l )= + o(l) = 5(c»)/xi(A) + o(l), 

ji<j<h 

where $\mu_1(\lambda) = \int_{1 < z < \lambda} z \exp(-z^2)\,dz$. Therefore, in
the limit as $n \to \infty$, $A_1$ equals a sum, $S_\lambda$ say, of $N$
independent random variables each having the distribution of $f(W)$, where $W$
is uniformly distributed on $[1, \lambda]$, $N$ has a Poisson distribution
with mean $\delta(\infty) \mu_1(\lambda)$, and $N$ and the summands are
independent. The distribution of $S_\lambda$ is stochastically monotone
increasing, in the sense that $P(S_\lambda > s)$ increases with $\lambda$. On
the other hand, since $\mu(\lambda) \to 0$ as $\lambda \to \infty$ then, by
(5.32) and (5.33),

$\lim_{\lambda \to \infty} \limsup_{n \to \infty} E(A_2) = 0.$

Combining these results, we deduce that $A_1 + A_2$, that is, the left-hand
side of (5.29), does not converge to zero in probability. This contradicts
(5.29) and so establishes that $t(\infty)$ must equal zero; that is, (5.30)
holds.

Comments on proving the polynomial case: the proof for the case of polynomial
tails proceeds similarly. The main difference is that in the proof of
Lemma 2 we use (3.2) instead of (3.1), which forces a factor of
$p^{-1/\alpha}$ into the results of the lemma, rather than
$(\log n)^{1-(1/\alpha)}$. This in turn implies that
$s(n) \asymp n^{-1/2} j_0^{2+(1/\alpha)} p^{-1/\alpha}$, entailing that
convergence occurs if (and, in the case of independence, only if)
$j_0 = o\{(n^{\alpha/2} p)^{1/(2\alpha+1)}\}$, as required. □



REFERENCES 

Alon, U., Barkai, N., Notterman, D. A., Gish, K., Ybarra, S., Mack, D. and 
Levine, A. J. (1999). Broad patterns of gene expression revealed by clustering analysis 
of tumor and normal colon tissues probed by oligonucleotide arrays. Proc. Natl. Acad. 
Sci. 96 6745-6750. 

Amosova, N. N. (1972). Limit theorems for the probabilities of moderate deviations. 
Vestnik Leningrad. Univ. No. 13 Mat. Meh. Astronom. Vyp. 5-14, 148. MR0331484 






Barker, L. E., Smith, P. J., Gerzoff, R. B., Luman, E. T., McCauley, M. M. and 
Strine, T. W. (2005). Ranking states' immunization coverage: An example from the 
National Immunization Survey. Stat. Med. 24 605-613. MR2134528 

Brijs, T., Van Den Bossche, F., Wets, G. and Karlis, D. (2006). A model for 
identifying and ranking dangerous accident locations: A case study in Flanders. Statist. 
Neerlandica 60 457-476. MR2291385 

Brijs, T., Karlis, D., Van Den Bossche, F. and Wets, G. (2007). A Bayesian model 
for ranking hazardous road sites. J. Roy. Statist. Soc. Ser. A 170 1001-1017. MR2408989 

Cesario, L. C. and Barreto, M. C. M. (2003). Study of the performance of bootstrap 
confidence intervals for the mean of a normal distribution using perfectly ranked set 
sampling. Rev. Mat. Estatist. 21 7-20. MR2058492 

Chen, H., Stasny, E. A. and Wolfe, D. A. (2006). An empirical assessment of ranking
accuracy in ranked set sampling. Comput. Statist. Data Anal. 51 1411-1419. MR2297530

Corain, L. and Salmaso, L. (2007). A non-parametric method for defining a global 
preference ranking of industrial products. J. Appl. Statist. 34 203-216. MR2364253 

Goldstein, H. and Spiegelhalter, D. J. (1996). League tables and their limitations: 
Statistical issues in comparisons of institutional performance. J. Roy. Statist. Soc. Ser. 
A 159 385-443. 

Hall, P. and Miller, H. (2009). Using the bootstrap to quantify the authority of an 
empirical ranking. Ann. Statist. 37 3929-3959. MR2572448 

Hill, B. M. (1975). A simple general approach to inference about the tail of a distribution. 
Ann. Statist. 3 1163-1174. MR0378204 

Hui, T. P., Modarres, R. and Zheng, G. (2005). Bootstrap confidence interval esti- 
mation of mean via ranked set sampling linear regression. J. Stat. Comput. Simul. 75 
543-553. MR2162545 

Joe, H. (2000). Inequalities for random utility models, with applications to ranking and 
subset choice data. Methodol. Comput. Appl. Probab. 2 359-372. MR1836406 

Joe, H. (2001). Multivariate extreme value distributions and coverage of ranking proba- 
bilities. J. Math. Psych. 45 180-188. MR1820238 

Langford, I. H. and Leyland, A. H. (1996). Discussion of "League tables and their 
limitations: Statistical issues in comparisons of institutional performance" by Goldstein 
and Spiegelhalter. J. Roy. Statist. Soc. Ser. A 159 427-428. 

McHale, I. and Scarf, P. (2005). Ranking football players. Significance 2 54-57. 
MR2224085 

Mease, D. (2003). A penalized maximum likelihood approach for the ranking of college 
football teams independent of victory margins. Amer. Statist. 57 241-248. MR2016258 

Mukherjee, S. N., Sykacek, P., Roberts, S. J. and Gurr, S. J. (2003). Gene ranking
using bootstrapped p-values. SIGKDD Explorations 5 14-18.

Murphy, T. B. and Martin, D. (2003). Mixtures of distance-based models for ranking 
data. Comput. Statist. Data Anal. 41 645-655. MR1973732 

Nordberg, L. (2006). On the reliability of performance rankings. In Festschrift for Tarmo 
Pukkila on His 60th Birthday (E. P. Liski, J. Isotalo, J. Niemela and G. P. H. Styan, 
eds.) 205-216. Univ. Tampere, Tampere, Finland. MR2412962 

Opgen-Rhein, R. and Strimmer, K. (2007). Accurate ranking of differentially expressed 
genes by a distribution-free shrinkage approach. Stat. Appl. Genet. Mol. Biol. 6 Art. 9, 
20pp. (electronic). MR2306944 

Quevedo, J. R., Bahamonde, A. and Luaces, O. (2007). A simple and efficient method 
for variable ranking according to their usefulness for learning. Comput. Statist. Data 
Anal. 52 578-595. MR2410003 






Rényi, A. (1953). On the theory of order statistics. Acta Math. Acad. Sci. Hungar. 4
191-232. MR0061792

Rubin, H. and Sethuraman, J. (1965). Probabilities of moderate deviations. Sankhyā
Ser. A 27 325-346. MR0203783

Taconeli, C. A. and Barreto, M. C. M. (2005). Evaluation of a bootstrap confi- 
dence interval approach in perfectly ranked set sampling. Rev. Mat. Estatist. 23 33-53. 
MR2304506 

Xie, M., Singh, K. and Zhang, C. H. (2009). Confidence intervals for population ranks 
in the presence of ties and near ties. J. Amer. Statist. Assoc. 104 775-787. MR2541594 

Department of Mathematics and Statistics 
University of Melbourne 
Melbourne, VIC 3010 
Australia 

E-mail: halpstat@ms.unimelb.edu.au
h.miller@ms.unimelb.edu.au



